When working with a Lucene index through Zend Framework's Lucene search component, you'll often want to update documents over the course of the index's lifecycle. This can prove tricky with the current implementation because there is no in-place update feature: you must first delete the old document and then add a new one. The tricky part is locating the unique document you want to update. The 'old way' was as follows:
    // Retrieving documents with find() method using a query string
    $query = $idFieldName . ':' . $docId;
    $hits = $index->find($query);
    foreach ($hits as $hit) {
        $title = $hit->title;
        $contents = $hit->contents;
    }
This proves _painfully_ slow: you're running a full search over the index just to find one unique document by its ID. It's even worse if your unique ID happens to be a string such as a URL or path. Since ZF 1.5, the 'best practice' for this type of task is to use the Zend_Search_Lucene::termDocs() method:
    $term = new Zend_Search_Lucene_Index_Term('/somepath/somewhere', 'path');
    $docIds = $index->termDocs($term);
    foreach ($docIds as $id) {
        $doc = $index->getDocument($id);
        $title = $doc->title;
        $contents = $doc->contents;
    }
Performance-wise this is much more efficient. However, unless you're careful at the indexing stage, you may run into trouble when running termDocs() on a string value such as a URL or path, as opposed to an integer ID. The cause is the field having been added as a tokenized field. This is the most common way fields are added, and corresponds to:
    $doc = new Zend_Search_Lucene_Document();
    $doc->addField(Zend_Search_Lucene_Field::Text('title', $title));
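To see why a tokenized field defeats an exact-term lookup, here is a toy illustration in plain PHP. It does not use Zend_Search_Lucene at all: toyTokenize() is a hypothetical stand-in that assumes an analyzer which splits on non-letter characters and lowercases the result, which is roughly how I understand the default analyzer to behave.

```php
<?php
// Toy approximation of a letters-only, case-insensitive analyzer.
// This is NOT the Zend_Search_Lucene implementation, only an illustration.
function toyTokenize($value)
{
    preg_match_all('/[a-zA-Z]+/', $value, $matches);
    return array_map('strtolower', $matches[0]);
}

$path = '/somepath/somewhere';

// Tokenized (Text-style) indexing: the value is split into several terms.
$tokenizedTerms = toyTokenize($path);

// Untokenized (Keyword-style) indexing: the whole value is a single term.
$keywordTerms = array($path);

// An exact-term lookup for the full path only matches the untokenized form.
var_dump(in_array($path, $tokenizedTerms, true)); // bool(false)
var_dump(in_array($path, $keywordTerms, true));   // bool(true)
```

Under that assumption the index holds the terms 'somepath' and 'somewhere', but never the full '/somepath/somewhere' string, so a term built from the full path finds nothing.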
If you want to use termDocs on an identifying field you need to add the field as type Keyword:
    $doc = new Zend_Search_Lucene_Document();
    // Keyword() takes the field name first, then the value
    $doc->addField(Zend_Search_Lucene_Field::Keyword('uri', 'http://a.com/uri'));
Keyword fields are not tokenized, so the whole value is indexed as a single term, which is what an exact termDocs() lookup needs. The distinction between the two field types is documented in Zend_Search_Lucene_Field's phpdocs:
Zend_Search_Lucene_Field::Text() constructs a String-valued Field that is tokenized and indexed, and is stored in the index, for return with hits. Useful for short text fields, like "title" or "subject". Term vector will not be stored for this field.
In contrast, see:
Zend_Search_Lucene_Field::Keyword() constructs a String-valued Field that is not tokenized, but is indexed and stored. Useful for non-text fields, e.g. date or url.
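Putting the pieces together, the delete-then-add update described at the start could be sketched roughly as follows. The helper name replaceDocumentByTerm() is my own invention, not part of the framework. It assumes $index behaves like a Zend_Search_Lucene index (termDocs(), delete(), addDocument() and commit()) and that $term targets a Keyword field.

```php
<?php
/**
 * Replace every document matching $term with $newDoc (hypothetical helper).
 *
 * $index is expected to behave like a Zend_Search_Lucene index; $term would
 * normally be a Zend_Search_Lucene_Index_Term built on a Keyword field.
 * Returns the number of old documents that were deleted.
 */
function replaceDocumentByTerm($index, $term, $newDoc)
{
    $deleted = 0;

    // Look up the internal document IDs for the exact term and delete each.
    foreach ($index->termDocs($term) as $id) {
        $index->delete($id);
        $deleted++;
    }

    // Add the replacement and write the changes to the index.
    $index->addDocument($newDoc);
    $index->commit();

    return $deleted;
}
```

With a real index you would build the term as new Zend_Search_Lucene_Index_Term('http://a.com/uri', 'uri') and pass it in along with the freshly built document. A return value of 0 is a useful hint that the field was probably indexed tokenized rather than as a Keyword.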
This distinction caught me out until I dug around the source to see where termDocs() was going wrong. Hopefully this saves someone else some time, and hopefully Zend can update their documentation to draw other developers' attention to this quirk.